Amendment to the Specification:

Please amend the specification as follows. 

Please replace the paragraph beginning at page 1, line 1 (inserted in Applicants'
amendment of July 26, 2002), with the following amended paragraph:

This application is a Continuation-in-Part application of co-pending U.S. Patent 
Application Serial No. 09/391,340, filed September 7, 1999, issued as U.S. Patent No. 6,492,511,
December 10, 2002, which is a divisional of U.S. Patent Application Serial No. 08/907,166,
filed August 6, 1997, now issued as U.S. Patent No. 5,948,666.

Please replace the paragraph at lines 12 to 21 on page 32 with the following
amended paragraph: 

The invention also provides for the use of proprietary codon primers (containing a 
degenerate N,N,N sequence) to introduce point mutations into a polynucleotide, so as to generate 
a set of progeny polypeptides in which a full range of single amino acid substitutions is 
represented at each amino acid position ([[gene site saturated mutagenesis (GSSM)]] Gene Site
Saturation Mutagenesis™ (GSSM™)). The oligos used are comprised contiguously of a first
homologous sequence, a degenerate N,N,N sequence, and preferably but not necessarily a second
homologous sequence. The downstream progeny translational products from the use of such
oligos include all possible amino acid changes at each amino acid site along the polypeptide, 
because the degeneracy of the N,N,N sequence includes codons for all 20 amino acids. 
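
The coverage claim in the paragraph above can be checked with a short sketch. The Python below is illustrative only and is not part of the specification: it assumes the standard genetic code, and the names CODON_TABLE and saturate_codon are hypothetical. It replaces one codon of a coding sequence with each of the 64 possible N,N,N triplets and records the amino acid that each variant encodes at that position.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def saturate_codon(coding_seq, codon_index):
    """Return {variant_dna: amino_acid} for every N,N,N replacement of one codon.

    coding_seq  -- in-frame DNA coding sequence (hypothetical input)
    codon_index -- zero-based index of the codon to saturate
    """
    start = 3 * codon_index
    if codon_index < 0 or start + 3 > len(coding_seq):
        raise ValueError("codon_index is outside the coding sequence")
    variants = {}
    for codon, aa in CODON_TABLE.items():
        variant = coding_seq[:start] + codon + coding_seq[start + 3:]
        variants[variant] = aa
    return variants


if __name__ == "__main__":
    variants = saturate_codon("ATGGCCAAA", 1)
    residues = set(variants.values()) - {"*"}
    print(len(variants), "DNA variants encoding", len(residues), "amino acids")
```

Under these assumptions, saturating the second codon of ATGGCCAAA gives 64 DNA variants whose encoded residues span all 20 amino acids, with the remaining triplets being stop codons.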

Please replace the paragraph of page 59, line 31 to page 61, line 4, with the 
following amended paragraph: 

A "comparison window", as used herein, includes reference to a segment of any one
of the number of contiguous positions selected from the group consisting of from 20 to 600, usually
about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared
to a reference sequence of the same number of contiguous positions after the two sequences are 
optimally aligned. Methods of alignment of sequence for comparison are well-known in the art. 
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology 
algorithm of Smith & Waterman, Adv. Appl. Math. 2:482, 1981, by the homology alignment 
algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443, 1970, by the search for similarity method
of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988, by computerized implementations
of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software 
Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and 
visual inspection. Other algorithms for determining homology or identity include, for example, in
addition to a BLAST program (Basic Local Alignment Search Tool at the National Center for 
Biological Information), ALIGN, AMAS (Analysis of Multiply Aligned Sequences), AMPS 
(Protein Multiple Sequence Alignment), ASSET (Aligned Segment Statistical Evaluation Tool), 
BANDS, BESTSCOR, BIOSCAN (Biological Sequence Comparative Analysis Node), BLIMPS 
(BLocks IMProved Searcher), FASTA, Intervals & Points, BMB, CLUSTAL V, CLUSTAL W,
CONSENSUS, LCONSENSUS, WCONSENSUS, Smith-Waterman algorithm, DARWIN, Las 
Vegas algorithm, FNAT (Forced Nucleotide Alignment Tool), Framealign, Framesearch, 
DYNAMIC, FILTER, FSAP (Fristensky Sequence Analysis Package), GAP (Global Alignment 
Program), GENAL, GIBBS, GenQuest, ISSC (Sensitive Sequence Comparison), LALIGN 
(Local Sequence Alignment), LCP (Local Content Program), MACAW (Multiple Alignment 
Construction & Analysis Workbench), MAP (Multiple Alignment Program), MBLKP, MBLKN, 
PIMA (Pattern-Induced Multi-sequence Alignment), SAGA (Sequence Alignment by Genetic 
Algorithm) and WHAT-IF. Such alignment programs can also be used to screen genome 
databases to identify polynucleotide sequences having substantially identical sequences. A 
number of genome databases are available, for example, a substantial portion of the human 
genome is available as part of the Human Genome Sequencing Project (J. Roach, 
http://weber.u.washington.edu/~roach/human_genome_progress2.html) (Gibbs, 1995). At least
twenty-one other genomes have already been sequenced, including, for example, M. genitalium
(Fraser et al., 1995), M. jannaschii (Bult et al., 1996), H. influenzae (Fleischmann et al., 1995), E.
coli (Blattner et al., 1997), and yeast (S. cerevisiae) (Mewes et al., 1997), and D. melanogaster
(Adams et al., 2000). Significant progress has also been made in sequencing the genomes of model
organisms, such as mouse, C. elegans, and Arabidopsis sp. Several databases containing genomic
information annotated with some functional information are maintained by different organizations,
and are accessible via the internet, for example, http://www.tigr.org/tdb;
http://www.genetics.wisc.edu; http://genome-www.stanford.edu/~ball; http://hiv-web.lanl.gov;
www.genome.wi.mit.edu.
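
As a rough illustration of the local homology approach of Smith & Waterman cited above, the following Python sketch computes a local alignment score by dynamic programming. It is not taken from the specification: the function name smith_waterman_score and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration, and the programs listed above use their own scoring schemes.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    The scoring values are illustrative assumptions. Each cell holds the best
    score of a local alignment ending at that pair of positions, floored at
    zero so that an alignment may start anywhere in either sequence.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]  # first column is always zero
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in sequence b
            left = curr[j - 1] + gap  # gap in sequence a
            curr.append(max(0, diag, up, left))
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Only two rows of the matrix are kept because the sketch reports the score alone; recovering the aligned segments themselves would require storing the full matrix and tracing back from the best cell.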

Please replace the paragraph of page 61, lines 5 to 27, with the following 
amended paragraph: 

One example of a useful algorithm is BLAST and BLAST 2.0 algorithms, which are 
described in Altschul et al., Nuc. Acids Res. 25:3389-3402, 1977, and Altschul et al., J. Mol. Biol.
215:403-410, 1990, respectively. Software for performing BLAST analyses is publicly available 
through the National Center for Biotechnology Information [[(http://www.ncbi.nlm.nih.gov/)]].
This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short 
words of length W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database sequence. T is referred 
to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood
word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are 
extended in both directions along each sequence for as far as the cumulative alignment score can be 
increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M 
(reward score for a pair of matching residues; always >0). For amino acid sequences, a scoring 
matrix is used to calculate the cumulative score. Extension of the word hits in each direction is
halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved 
value; the cumulative score goes to zero or below, due to the accumulation of one or more negative- 
scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm 
parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN
program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of
10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program
uses as defaults a wordlength of 3, and expectations (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1989) alignments (B) of 50,
expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
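
The extension step described in the paragraph above can be sketched in a few lines of Python. This is a simplified, ungapped, one-directional illustration and not the BLAST implementation: the function name extend_right and the default drop-off x_drop=20 are assumptions, while M=5 and N=-4 are the BLASTN defaults quoted in the text. Extension stops when the score falls by X from its maximum, when the score reaches zero or below, or when either sequence ends.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts positions
    from the start of the seed. x_drop is an assumed value for X.
    """
    def pair_score(i):
        return match if query[q_start + i] == subject[s_start + i] else mismatch

    # Score the seed word itself.
    score = sum(pair_score(i) for i in range(word_len))
    best_score, best_len = score, word_len

    i = word_len
    # Stop when the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += pair_score(i)
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        # Stop when the score is zero or below, or has dropped by X from its maximum.
        if score <= 0 or best_score - score >= x_drop:
            break
    return best_score, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTAC", "ACGTACGTAC", 0, 0, 4))
```

A full search would run the same loop to the left of the seed as well and combine the two results into a single high scoring sequence pair.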

Please replace the paragraph of page 62, lines 16 to 27, with the following
amended paragraph: 

The BLAST programs identify homologous sequences by identifying similar
segments, which are referred to herein as "high-scoring segment pairs," between a query amino
or nucleic acid sequence and a test sequence which is preferably obtained from a protein or 
nucleic acid sequence database. High-scoring segment pairs are preferably identified (i.e., 
aligned) by means of a scoring matrix, many of which are known in the art. Preferably, the 
scoring matrix used is the BLOSUM62 matrix (Gonnet et al., Science 256:1443-1445, 1992;
Henikoff and Henikoff, Proteins 17:49-61, 1993). Less preferably the PAM or PAM250
matrices may also be used (see, e.g., Schwartz and Dayhoff, eds., 1978, Matrices for Detecting 
Distance Relationships: Atlas of Protein Sequence and Structure, Washington: National
Biomedical Research Foundation). BLAST programs are accessible through the U.S. National 
Library of Medicine, e.g., at [[www.ncbi.nlm.nih.gov]].

Please replace the paragraph of page 67, lines 9 to 20, with the following 
amended paragraph: 

Figure 5 is a flow diagram illustrating one embodiment of an identifier process 
300 for detecting the presence of a feature in a sequence. The process 300 begins at a start state 
302 and then moves to a state 304 wherein a first sequence that is to be checked for features is 
stored to a memory 115 in the computer system 100. The process 300 then moves to a state 306
wherein a database of sequence features is opened. Such a database would include a list of each 
feature's attributes along with the name of the feature. For example, a feature name could be 
"Initiation Codon" and the attribute would be "ATG". Another example would be the feature 
name "TAATAA Box" and the feature attribute would be "TAATAA". An example of such a
database is produced by the University of Wisconsin Genetics Computer Group 
[[(www.gcg.com)]]. Alternatively, the features may be structural polypeptide motifs such as 
alpha helices, beta sheets, or functional polypeptide motifs such as enzymatic active sites, helix- 
turn-helix motifs or other motifs known to those skilled in the art. 
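
The feature lookup described above can be illustrated with a minimal Python sketch. It is an assumption-laden example rather than the process of Figure 5 itself: the excerpt describes only storing the sequence and opening the feature database, so the comparison loop, the in-memory dictionary FEATURES, and the function name find_features are hypothetical. The two feature entries are the examples given in the text.

```python
# Hypothetical in-memory stand-in for the database of sequence features:
# feature name -> feature attribute (the sequence pattern to look for).
FEATURES = {
    "Initiation Codon": "ATG",
    "TAATAA Box": "TAATAA",
}


def find_features(sequence, features=FEATURES):
    """Return a list of (feature_name, position) for every occurrence.

    Positions are zero-based offsets into the sequence. Matching is a plain
    case-insensitive substring search, which is an assumption of this sketch.
    """
    seq = sequence.upper()
    hits = []
    for name, attribute in features.items():
        pos = seq.find(attribute)
        while pos != -1:
            hits.append((name, pos))
            # Continue one position later so overlapping matches are found.
            pos = seq.find(attribute, pos + 1)
    return hits


if __name__ == "__main__":
    for name, pos in find_features("ccATGaaTAATAAgg"):
        print(name, "at position", pos)
```

Structural or functional polypeptide motifs would generally need pattern or profile matching rather than exact substrings, which this sketch does not attempt.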